A Two-Stage Approach for Generating Unbiased Estimates of Text Complexity
Abstract
Many existing approaches for measuring text complexity tend to overestimate the complexity of informational texts while underestimating the complexity of literary texts. We present a two-stage estimation technique that successfully addresses this problem. At Stage 1, each text is classified into one of three genres: informational, literary, or mixed. At Stage 2, a complexity score is generated for each text by applying the prediction model optimized for its genre: one model for informational texts, one for literary texts, and one for mixed texts. Each model combines lexical, syntactic, and discourse features, as appropriate, to best replicate human complexity judgments. We demonstrate that the resulting text complexity predictions are both unbiased and highly correlated with classifications provided by experienced educators.
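The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three toy features, the quotation-mark rule standing in for the Stage 1 genre classifier, the linear form of the Stage 2 models, and all weights are hypothetical placeholders. The sketch also omits discourse features. Only the routing structure (classify genre first, then score with the genre-specific model) mirrors the abstract.

```python
import re
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Features:
    mean_word_length: float      # crude lexical proxy (hypothetical)
    mean_sentence_length: float  # crude syntactic proxy (hypothetical)
    dialogue_ratio: float        # share of sentences containing a double quote


def extract_features(text: str) -> Features:
    """Compute toy features; a real system would use far richer ones."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    quoted = sum(1 for s in sentences if '"' in s)
    return Features(
        mean_word_length=sum(len(w) for w in words) / len(words),
        mean_sentence_length=len(words) / len(sentences),
        dialogue_ratio=quoted / len(sentences),
    )


def classify_genre(features: Features) -> str:
    """Stage 1: placeholder rule, NOT the paper's classifier."""
    if features.dialogue_ratio >= 0.5:
        return "literary"
    if features.dialogue_ratio == 0.0:
        return "informational"
    return "mixed"


# Stage 2: one linear model per genre. All weights are made-up placeholders;
# in practice each model would be fit to human complexity judgments.
MODELS: Dict[str, Dict[str, float]] = {
    "informational": {"intercept": -2.0, "word": 1.0, "sentence": 0.20},
    "literary":      {"intercept": 0.5,  "word": 0.6, "sentence": 0.30},
    "mixed":         {"intercept": -0.5, "word": 0.8, "sentence": 0.25},
}


def predict_complexity(text: str) -> Tuple[str, float]:
    """Route the text to its genre-specific model and return (genre, score)."""
    features = extract_features(text)
    genre = classify_genre(features)
    weights = MODELS[genre]
    score = (
        weights["intercept"]
        + weights["word"] * features.mean_word_length
        + weights["sentence"] * features.mean_sentence_length
    )
    return genre, score


if __name__ == "__main__":
    sample = 'The scientist looked up. "The data are clear," she said.'
    print(predict_complexity(sample))
```

The point of the design is that the same feature values can map to different scores depending on genre, which is how a genre-specific model can avoid the systematic over- and underestimation that a single pooled model would show.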
Similar resources
The Impact of Summary Writing with Structure Guidelines on EFL College Students’ Rhetorical Organization: Integrating Genre-Based and Process Approaches
This study aimed at investigating the impact of writing on Iranian EFL college students’ rhetorical organization. Thirty Iranian female undergraduate students majoring in English at Al-zahra University participated in the current study. The writing instructions included two stages, each lasting for four weeks. The participants were assigned to a control group and an experimental group according...
A New Approach Generating Robust and Stable Schedules in m-Machine Flow Shop Scheduling Problems: A Case Study
This paper considers a scheduling problem with uncertain processing times and machine breakdowns in industrial/office workplaces and solves it via a novel robust optimization method. In traditional robust optimization, solution robustness is maintained only for a specific set of scenarios, which may worsen the situation for new scenarios. Thus, a two-stage predictive algorithm is prop...
Stage specialization for design and analysis of flotation circuits
This paper presents a new approach for flotation circuit design. Initially, it was proven numerically and analytically that in order to achieve the highest recovery in different circuit configurations, the best equipment must be placed at the beginning stage of the flotation circuits. The size of the entering particles and the types of streams including pulp and froth were considered as the bas...
Bi-objective Optimization for Just in Time Scheduling: Application to the Two-Stage Assembly Flow Shop Problem
This paper considers a two-stage assembly flow shop problem (TAFSP) where m machines are in the first stage and an assembly machine is in the second stage. The objective is to minimize a weighted sum of earliness and tardiness time for n available jobs. Just-in-time (JIT) scheduling seeks to identify and eliminate waste components including overproduction, waiting time, transportation, inventory, movement and defective...
Syntactic Complexity of Russian Unified State Exam Texts in English: A Study on Reliability and Validity
In this study we analyze texts used in the Russian Unified State Exam (USE) in English. The texts, which form two small research corpora, were retrieved from two sources: the official USE database as a reference point, and “Neznaika” (https://neznaika.pro/), a popular website used by pupils for USE training. The two corpora are balanced in size: USE has 11934 tokens and “Neznaika” has 11918 tokens. We share Biber’...